3574 results found.
Multimodal/Multimedia
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
CreativeCommons
Size:
2000 hours
Production Status:
Existing-used
Use:
Summarisation
- Paper title: Multimodal Speech Summarization through Semantic Concept Learning
- Paper track: 12.10 Spoken document summarization/Oral Presentation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Shruti Palaskar | How2 Dataset | /N |
Documentation:
English
Speech
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
From Data Center(s)
License:
LDC
Size:
None
Production Status:
Existing-used
Use:
Speech Recognition/Understanding
- Paper title: Universal Speaker Extraction in the Presence and Absence of Target Speakers for Speech of One and Two Talkers
- Paper track: 5.8 Source separation/Poster Presentation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Marvin Borsdorf | CSR-I (WSJ0) Complete | /N |
Documentation:
Yes, in English. Available at: https://catalog.ldc.upenn.edu/docs/LDC93S6A/
Speech
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
License:
LDC
Size:
260 hours
Production Status:
Existing-used
Use:
Speech Recognition/Understanding
- Paper title: Differentiable Allophone Graphs for Language Universal Speech Recognition
- Paper track: 9.8 Cross-lingual and multilingual components for /Oral Presentation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Brian Yan | Switchboard | /N |
Documentation:
None
Speech/Written
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
CC BY 4.0
Size:
1000 hours
Production Status:
Existing-used
Use:
Speech Recognition/Understanding
- Paper title: Multi-mode Transformer Transducer with Stochastic Future Context
- Paper track: 8.6 Neural network training methods (including new/Poster Presentation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Kwangyoun Kim | Librispeech | /N |
Documentation:
None
Multimodal/Multimedia
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
MIT
Size:
648 hours
Production Status:
Existing-used
Use:
Video captioning
- Paper title: Optimizing latency for online video captioning using multi-modal Transformers
- Paper track: 5.3 Acoustic event detection and acoustic scene cl/Oral Presentation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Chiori Hori | ActivityNet | /N |
Documentation:
None
Speech/Written
Corpus,
Language Type:
Bilingual
Languages:
English Mandarin Chinese
Availability:
From Data Center(s)
License:
Size:
1000 hours
Production Status:
Existing-used
Use:
Speech Recognition/Understanding
- Paper title: Raw Waveform Encoder with Multi-Scale Globally Attentive Locally Recurrent Networks for End-to-End Speech Recognition
- Paper track: 8.1 Feature extraction and low-level feature model/Oral Presentation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Max W. Y. Lam | AISHELL-2 | /N |
Documentation:
None
Speech
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
CreativeCommons
Size:
14.1 GByte
Production Status:
Existing-used
Use:
Speech Synthesis
- Paper title: One-shot Voice Conversion with Speaker-agnostic StarGAN
- Paper track: 7.10 Voice modification, conversion and morphing/Poster Presentation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Sefik Emre Eskimez | voice clone toolkit | /N |
Documentation:
None
Speech/Written
Corpus,
Language Type:
Multilingual
Languages:
Czech English German
Availability:
Freely Available
License:
CreativeCommons
Size:
10 hours
Production Status:
Newly created-finished
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: Lost in Interpreting: Speech Translation from Source or Interpreter?
- Paper track: 12.1 Spoken machine translation/Oral Presentation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Matúš Žilinec | ESIC | /N |
Documentation:
documentation in English
Speech
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
From Data Center(s)
License:
Size:
18 hours
Production Status:
Existing-used
Use:
Evaluation/Validation
- Paper title: Noise robust acoustic modeling for single-channel speech recognition based on a stream-wise transformer architecture
- Paper track: 8.3 Robustness against noise or reverberation/Poster Presentation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Masakiyo Fujimoto | CHiME4 | /N |
Documentation:
None
Speech
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
OpenSource
Size:
304 GByte
Production Status:
Existing-used
Use:
Speech Recognition/Understanding
- Paper title: IR-GAN: Room impulse response generator for far-field speech recognition
- Paper track: 8.4 Far field and microphone array speech recognit/Oral Presentation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Anton Ratnarajah | Librivox | /N |
Documentation:
There is an ICASSP paper on this dataset




